Unify SolrIndexSearcher paths with CollectorManager integration#4487
Unify SolrIndexSearcher paths with CollectorManager integration#4487markrmiller wants to merge 1 commit into
Conversation
…ased entry point Replaces the dual sequential/parallel branches in getDocListNC and getDocListAndSetNC with a single private searchAndCollect helper that internally selects between single-slice and multi-slice execution. The top-level methods no longer fork on MultiThreadedSearcher.allowMT. Key changes: - Extracts TopDocsCM, MaxScoreCM, DocSetCM, FixedBitSetCollector, SearchResult, TopDocsResult, and MaxScoreResult into a new SolrCollectorManagers utility. Adds PostFilterCM<T> (single-slice; setLastDelegate per newCollector, complete() at reduce) and DocSetCollectorCM (preserves the SortedIntDocSet sparse optimization). Multi-slice paths keep the FixedBitSet-based DocSetCM. - Deletes MultiThreadedSearcher; allowMT is inlined as the canParallelize check inside searchAndCollect. - Migrates the collector chain (segment-terminate-early, max-hits early termination, optional post-filter, optional cancellation) to a CollectorManager-based ChainCM applied per newCollector(). Single-slice vs multi-slice is chosen by ChainCM.requiresSingleSlice() OR the caller's parallelEligible hint. - Replaces the outer CancellableCollector with a shared AtomicBoolean polled by every per-slice CancellableCollector, so a single tracker.cancel(queryID) is observed across all parallel slices. - ChainCM.getScoreMode() returns the chain-wrapped collector's score mode so a Solr post-filter override (e.g. CollapsingPostFilter forcing COMPLETE) reaches populateScoresIfNeeded. Guaranteed non-null at the call site: Lucene's IndexSearcher.search calls newCollector() unconditionally before its empty-index short-circuit, so the call site uses Objects.requireNonNull rather than an unreachable fallback. - TopDocsCM.reduce skips TopDocs.merge for a single input, preserving the underlying TopDocsCollector's EQUAL_TO / GREATER_THAN_OR_EQUAL_TO relation exactly. - Aligns SolrMultiCollectorManager with Lucene's MultiCollector: per-child CollectionTerminatedException handling, per-child setMinCompetitiveScore aggregation (MinCompetitiveScoreAwareScorable), and removal of the EarlyTerminatingCollector wrap (ChainCM owns it, with a shared LongAdder across slices to preserve cross-slice maxHitsAllowed semantics). Clamps EarlyTerminatingCollector chunkSize to >= 1 to avoid divide-by-zero when maxDocsToCollect < 10. A few hot-path alignments with Lucene's MultiCollector pattern that together avoid a regression on heavy multi-term boolean OR queries: - searchAndCollect skips SolrMultiCollectorManager entirely when only one inner CollectorManager is built (top-N without fl=score and without docSet — the common case), routing through a new SolrCollectorManagers.wrapSingle adapter. Mirrors Lucene's own MultiCollectorManager.wrap() short-circuit for the one-collector case. - SolrMultiCollectorManager$Collectors$LeafCollectors.collect now matches MultiCollector.MultiLeafCollector#collect exactly: the per-iteration body is load-slot / null-check / call, and "all children terminated" is detected lazily inside the rare CollectionTerminatedException catch via allLeavesTerminated(). - QueryCancelledException and EarlyTerminatingCollectorException override fillInStackTrace() to return this, matching CollectionTerminatedException — these are control-flow exceptions and should not pay for stack-trace fill. No behavior change: numFound, ranked doc_ids, maxScore, numFoundExact and start match the parent commit on every probed (query, mode, multiThreaded) cell, including post-filter, STE, and post-filter + STE combinations. Verified with the full org.apache.solr.search.* suite plus targeted runs for TestMultiThreadedSearcher, TestEarlyTerminatingQueries, TestQueryLimits, CursorPagingTest, TestRankQueryPlugin, TestCollapseQParserPlugin, TestRandomCollapseQParserPlugin, and TestReRankQParserPlugin.
dsmiley
left a comment
There was a problem hiding this comment.
Very impressive... this is really an overhaul of multithreaded collection/search, not just a unification.
Glad to see benchmarks! But I'm surprised to see that solr/benchmark (JMH) wasn't used... why not? I'm also having trouble interpreting what you posted.
100 segments is a lot; not very realistic
| } | ||
| } | ||
|
|
||
| static class FixedBitSetCollector extends SimpleCollector { |
There was a problem hiding this comment.
could use a bit more docs/comments to explain what's going on
There was a problem hiding this comment.
surprisingly a net increase in LOC ... but more documentation so that's good
There was a problem hiding this comment.
So it seems this copies a lot of code from Lucene. What needs to change in Lucene's MultiCollectorManager and MultiCollector such that we would be able to extend their classes without sacrificing our functionality?
Replaces the dual sequential/parallel branches in getDocListNC and getDocListAndSetNC with a single private searchAndCollect helper that internally selects between single-slice and multi-slice execution. The top-level methods no longer fork on MultiThreadedSearcher.allowMT. Also makes canceling a query when doing parallel segment search actually do something.
Performance — this commit vs parent
1.5M docs / 150 segments, 300 samples per cell per build, combined from two independent round-robin sessions (parent and this commit runs alternated within each session against a single shared
solr-home). 30 warmup + 30 timed runs per cell per round.sudo powermetrics --samplers cpu_power,thermalran in parallel at 1s cadence for both sessions; thermal verification below.Negative delta = this commit faster than parent.
sudo powermetrics --samplers cpu_power,thermal -i 1000ran in parallel for both sessions (~13 min each).No
Fair/Serious/Criticalevents for any sample in any run across either session.